Random Sampling from Database Files: A Survey
Abstract
In this paper we survey known results on algorithms, data structures, and some applications of random sampling from databases. We first discuss various reasons for sampling from databases, and for inclusion of sampling as a DBMS operator. We consider basic sampling algorithms, sampling from trees, sampling from hash tables, and auxiliary memory resident index information to facilitate sampling.
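As an illustration of what a basic sampling algorithm for a sequentially read file can look like, here is a minimal sketch of reservoir sampling, a well-known single-pass technique. The abstract does not say which algorithms the paper covers, so the choice of technique, the function name `reservoir_sample`, and its parameters are assumptions made for illustration only.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    records: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Draw a simple random sample of up to k records in one pass.

    Hypothetical helper: the file size does not need to be known in advance,
    which suits a sequential scan of a database file.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, record in enumerate(records):
        if i < k:
            # Fill the reservoir with the first k records.
            reservoir.append(record)
        else:
            # Keep record i with probability k / (i + 1) by picking a slot
            # uniformly from 0..i and replacing only if it falls inside
            # the reservoir.
            j = rng.randint(0, i)  # both endpoints are inclusive
            if j < k:
                reservoir[j] = record
    return reservoir
```

For example, `reservoir_sample(range(1000), 10)` returns 10 distinct record identifiers; when the input holds fewer than `k` records, all of them are returned. The sketch reads every record once, which is the cost that index-based methods such as sampling from trees or hash tables aim to avoid.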
Similar resources
Random Sampling from B+ Trees
We consider the design and analysis of algorithms to retrieve simple random samples from databases. Specifically, we examine simple random sampling from B+ tree files. Existing methods of sampling from B+ trees require the use of auxiliary rank information in the nodes of the tree. Such modified B+ tree files are called “ranked B+ trees”. We compare sampling from ranked B+ tree files with new...
Random Sampling from Databases - A Survey
This paper reviews recent literature on techniques for obtaining random samples from databases. We begin with a discussion of why one would want to include sampling facilities in database management systems. We then review basic sampling techniques used in construct... join are then described. We then describe sampling for estimation of aggregates (e.g., the size of query results). Here we discuss...
Poster 2016: The effect and value of sublingual immunotherapy: a patient survey
Methods: A survey was sent to a random sample of 1,400 patients obtained from the AAOL newsletter database of 4,500 patients. The 20-question survey assessed patient demographics, perceived value of treatment, medication use, health and utilization ratings, compliance, school/work attendance, hospitalizations and unplanned physician visits, and health-related measures such as energy, sleep, and e...
A Bayesian Nominal Regression Model with Random Effects for Analysing Tehran Labor Force Survey Data
Large survey data are often accompanied by sampling weights that reflect the unequal probabilities of selecting samples in complex sampling. Sampling weights act as an expansion factor that, by scaling the subjects, makes the sample representative of the community. The quasi-maximum likelihood method is one of the approaches for considering sampling weights in the frequentist framewo...
Sampling design for an integrated socioeconomic and ecological survey by using satellite remote sensing and ordination.
Environmental variability is an important risk factor in rural agricultural communities. Testing models requires empirical sampling that generates data that are representative in both economic and ecological domains. Detrended correspondence analysis of satellite remote sensing data was used to design an effective low-cost sampling protocol for a field study to create an integrated socioeconom...